Everything Totally Explained



Everything about Likelihood Function totally explained

In statistics, the likelihood function (often simply the likelihood) is a function of the parameters of a statistical model that plays a key role in statistical inference. In non-technical usage, "likelihood" is a synonym for "probability", but throughout this article only the technical definition is used. Informally, if "probability" allows us to predict unknown outcomes based on known parameters, then "likelihood" allows us to estimate unknown parameters based on known outcomes.
   In a sense, likelihood works backwards from probability: given B, we use the conditional probability P(A|B) to reason about A, and given A, we use the likelihood function L(B|A) to reason about B. This mode of reasoning is formalized in Bayes' theorem:
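    P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A)}.

For example, suppose a coin that lands heads with probability p_H is tossed twice, independently, and both tosses come up heads (the observation 'HH'). If p_H = 0.5, the probability of that observation is 0.25, so the likelihood of p_H = 0.5 given 'HH' is

    L(p_H = 0.5 \mid HH) = P(HH \mid p_H = 0.5) = 0.25.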

But this isn't the same as saying that the probability of p_H = 0.5, given the observation, is 0.25.
   To take an extreme case, on this basis we can say "the likelihood of p_H = 1 given the observation 'HH' is 1". But it's clearly not the case that the probability of p_H = 1 given the observation is 1: the event 'HH' can occur for any p_H > 0 (and in reality often does, for p_H roughly 0.5). If the probability of p_H = 1 given the observation were 1, p_H would have to equal 1 whenever 'HH' occurs, which is obviously not true.
   The likelihood function isn't a probability density function: for example, the integral of a likelihood function isn't in general 1. In this example, L(p_H | HH) = p_H^2, and its integral over the interval [0,1] in p_H is 1/3, demonstrating again that the likelihood function can't be interpreted as a probability density function for p_H. On the other hand, given any particular value of p_H, for example p_H = 0.5, the probabilities of the possible outcomes (HH, HT, TH, TT) sum to 1.
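
The numbers in this example can be checked with a short sketch. It assumes two independent tosses of the same coin, so that the likelihood of p_H given 'HH' is p_H squared; the function names and the midpoint-rule integrator are illustrative choices, not part of any library.

```python
# Likelihood of p_H after observing two heads ('HH') in two independent tosses.
def likelihood_hh(p_h):
    """L(p_H | HH) = P(HH | p_H) = p_H ** 2 (independence is assumed)."""
    return p_h ** 2


def outcome_probabilities(p_h):
    """Probabilities of the four possible two-toss outcomes for a fixed p_H."""
    q = 1.0 - p_h
    return {"HH": p_h * p_h, "HT": p_h * q, "TH": q * p_h, "TT": q * q}


def integrate_midpoint(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


likelihood_at_half = likelihood_hh(0.5)   # likelihood of p_H = 0.5
likelihood_at_one = likelihood_hh(1.0)    # likelihood of p_H = 1

# Integrating the likelihood over the parameter does not give 1 ...
area_under_likelihood = integrate_midpoint(likelihood_hh, 0.0, 1.0)

# ... whereas summing probabilities over outcomes, for a fixed p_H, does.
total_probability = sum(outcome_probabilities(0.5).values())
```

Here `area_under_likelihood` comes out close to 1/3, while `total_probability` is 1: the likelihood is normalised over outcomes, not over parameter values.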

Likelihoods that eliminate nuisance parameters

In many cases, the likelihood is a function of more than one parameter but interest focusses on the estimation of only one or at most a few of them, with the others being considered as nuisance parameters. Several alternative ways have been developed to eliminate such nuisance parameters so that a likelihood can be written as a function of the parameter (or parameters) of interest only, the main ones being marginal, conditional and profile likelihoods.
   These are useful because standard likelihood methods can become unreliable or fail entirely when there are many nuisance parameters (or the nuisance parameter is high-dimensional), particularly when the number of nuisance parameters is a substantial fraction of the number of observations and this fraction doesn't decrease when the sample size increases. They can also be used to derive closed-form formulae for statistical tests when direct use of maximum likelihood requires iterative numerical methods, and find application in some specialized topics such as sequential analysis.

Conditional likelihood

Sometimes it's possible to find a sufficient statistic for the nuisance parameters, and conditioning on this statistic results in a likelihood which doesn't depend on the nuisance parameters.
   One example occurs in 2×2 tables, where conditioning on all four marginal totals leads to a conditional likelihood based on the non-central hypergeometric distribution. (This form of conditioning is also the basis for Fisher's exact test.)
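
As a rough sketch of the 2×2 case, the conditional likelihood can be written as a function of the odds ratio alone. The function name, the cell labels a, b, c, d (rows are groups, columns are outcomes) and the example counts are assumptions made for illustration; the formula is the probability mass function of Fisher's non-central hypergeometric distribution.

```python
from math import comb


def conditional_likelihood(a, b, c, d, odds_ratio):
    """P(top-left cell = a | all margins fixed), as a function of the odds ratio.

    The baseline outcome probability (the nuisance parameter) does not appear.
    """
    row1, row2 = a + b, c + d          # row totals
    col1 = a + c                       # first column total
    lo = max(0, col1 - row2)           # smallest feasible top-left count
    hi = min(row1, col1)               # largest feasible top-left count

    def weight(u):
        return comb(row1, u) * comb(row2, col1 - u) * odds_ratio ** u

    return weight(a) / sum(weight(u) for u in range(lo, hi + 1))


# Hypothetical table: 3 of 4 successes in group 1, 1 of 4 in group 2.
at_null = conditional_likelihood(3, 1, 1, 3, odds_ratio=1.0)
at_nine = conditional_likelihood(3, 1, 1, 3, odds_ratio=9.0)
```

With an odds ratio of 1 the expression reduces to the ordinary hypergeometric probability used in Fisher's exact test; for this table an odds ratio of 9 has a higher conditional likelihood than an odds ratio of 1.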

Marginal likelihood


   Sometimes we can remove the nuisance parameters by considering a likelihood based on only part of the information in the data, for example by using the set of ranks rather than the numerical values. Another example occurs in linear mixed models, where considering a likelihood for the residuals only, after fitting the fixed effects, leads to residual maximum likelihood estimation of the variance components. (Note that "marginal likelihood" has a different meaning in Bayesian inference.)
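
The simplest instance of the residual-likelihood idea can be sketched as follows, assuming independent normal observations whose only fixed effect is a common mean. In that case the residual log-likelihood of the variance is, up to an additive constant, -(n - 1)/2 log(sigma^2) - SS/(2 sigma^2), where SS is the sum of squared deviations from the sample mean. The data values and names below are made up for illustration.

```python
from math import log


def residual_log_likelihood(sigma2, data):
    """Residual (restricted) log-likelihood of sigma2, up to an additive constant."""
    n = len(data)
    mean = sum(data) / n
    ss = sum((x - mean) ** 2 for x in data)    # residual sum of squares
    return -0.5 * (n - 1) * log(sigma2) - ss / (2.0 * sigma2)


data = [4.1, 5.3, 3.8, 6.0, 5.1]               # made-up illustrative values
n = len(data)
mean = sum(data) / n
ss = sum((x - mean) ** 2 for x in data)

ml_variance = ss / n             # maximises the full likelihood
reml_variance = ss / (n - 1)     # maximises the residual likelihood

# Crude grid search (step 0.01) confirming where the residual likelihood peaks.
grid = [0.01 * k for k in range(1, 1001)]
grid_best = max(grid, key=lambda s2: residual_log_likelihood(s2, data))
```

The grid maximum lands next to `reml_variance`, the familiar estimate with divisor n - 1, rather than next to the full maximum likelihood estimate with divisor n.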

Profile likelihood

It is often possible to write some parameters as functions of other parameters, thereby reducing the number of independent parameters. (The function gives the value of the eliminated parameter that maximises the likelihood for given values of the other parameters.) This procedure is called concentration of the parameters and results in the concentrated likelihood function, also occasionally known as the maximized likelihood function, but most often called the profile likelihood function. For example, consider a regression analysis model with normally distributed errors. For any given values of the regression coefficients, the most likely value of the error variance is the mean of the squared residuals, and the residuals depend on all the other parameters. Hence the variance parameter can be written as a function of the other parameters.
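
A minimal sketch of this concentration step, assuming a regression through the origin, y = slope * x + error, with independent normal errors. Substituting the variance estimate RSS(slope)/n into the normal log-likelihood gives the profile log-likelihood -n/2 (log(2 pi RSS/n) + 1). The data and names are made up for illustration.

```python
from math import log, pi


def profile_log_likelihood(slope, xs, ys):
    """Log-likelihood of the slope with the error variance concentrated out."""
    n = len(xs)
    rss = sum((y - slope * x) ** 2 for x, y in zip(xs, ys))
    sigma2_hat = rss / n                       # most likely variance for this slope
    return -0.5 * n * (log(2.0 * pi * sigma2_hat) + 1.0)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]                 # made-up illustrative data
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

# Least-squares slope for a line through the origin.
ols_slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Grid search over candidate slopes 1.5 .. 2.5 in steps of 0.001.
grid = [1.5 + 0.001 * k for k in range(1001)]
grid_best = max(grid, key=lambda b: profile_log_likelihood(b, xs, ys))
```

Because the profile log-likelihood decreases as the residual sum of squares grows, its maximiser coincides with the least-squares slope, which the grid search reproduces to within the grid spacing.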
   Unlike conditional and marginal likelihoods, profile likelihood methods can always be used (even when the profile likelihood can't be written down explicitly). However, the profile likelihood isn't a true likelihood as it isn't based directly on a probability distribution and this leads to some less satisfactory properties. (Attempts have been made to improve this, resulting in modified profile likelihood.)
   The idea of profile likelihood can also be used to compute confidence intervals that often have better small-sample properties than those based on asymptotic standard errors calculated from the full likelihood.
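
One common construction, sketched here under assumptions, takes as a 95% interval all parameter values whose profile log-likelihood lies within half of 3.841459 (the 0.95 quantile of the chi-square distribution with one degree of freedom) of its maximum. The example uses the mean of a normal sample with the variance profiled out; the data, the bisection helper and the search bounds are illustrative choices.

```python
from math import exp, log, sqrt

CHI2_95_1DF = 3.841459    # 0.95 quantile of chi-square with 1 degree of freedom


def profile_deviance(mu, data):
    """2 * (maximum profile log-likelihood - profile log-likelihood at mu)."""
    n = len(data)
    mean = sum(data) / n
    s2_hat = sum((x - mean) ** 2 for x in data) / n    # variance at the maximum
    s2_mu = sum((x - mu) ** 2 for x in data) / n       # most likely variance given mu
    return n * log(s2_mu / s2_hat)


def bisect_root(f, lo, hi, tol=1e-10):
    """Find a root of f in [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def excess(mu):
    """Positive outside the interval, negative inside it."""
    return profile_deviance(mu, data) - CHI2_95_1DF


data = [4.1, 5.3, 3.8, 6.0, 5.1]                       # made-up illustrative values
n = len(data)
mean = sum(data) / n
s2_hat = sum((x - mean) ** 2 for x in data) / n
spread = max(data) - min(data)

lower = bisect_root(excess, mean - 10 * spread, mean)
upper = bisect_root(excess, mean, mean + 10 * spread)

# For this model the interval also has a closed form, used here as a cross-check.
half_width_closed_form = sqrt(s2_hat * (exp(CHI2_95_1DF / n) - 1.0))
```

In this particular model the interval is symmetric about the sample mean; in general a profile likelihood interval need not be symmetric, which is one reason it can behave better than an interval built from a standard error.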

Historical remarks

Some early thoughts on likelihood appeared in a book by Thorvald N. Thiele published in 1889. The first paper in which the full idea of "likelihood" appears was written by R.A. Fisher in 1922: "On the mathematical foundations of theoretical statistics". In that paper, Fisher also uses the term "method of maximum likelihood". Fisher argues against inverse probability as a basis for statistical inference, and instead proposes inferences based on likelihood functions.




Copyright © 2007-8 totallyexplained.com | Licensed under the GNU Free Documentation License
This article contains text from the Wikipedia article Likelihood function and is released under the GFDL